Skip to content

Add PostgreSQL replication provider#58

Merged
renecannao merged 2 commits intomasterfrom
issue53-postgresql-provider
Mar 24, 2026
Merged

Add PostgreSQL replication provider#58
renecannao merged 2 commits intomasterfrom
issue53-postgresql-provider

Conversation

@renecannao
Copy link
Copy Markdown

@renecannao renecannao commented Mar 24, 2026

Summary

  • Implements the DatabaseProvider interface for PostgreSQL streaming replication (go/inst/provider_postgresql.go)
  • Maps PostgreSQL WAL-based replication state (pg_stat_wal_receiver, pg_current_wal_lsn, pg_wal_replay_pause/resume) to the common ReplicationStatus struct
  • Adds PostgreSQLTopologyUser and PostgreSQLTopologyPassword configuration fields to go/config/config.go
  • Adds github.com/lib/pq as a vendored dependency for PostgreSQL connectivity
  • Updates docs/database-providers.md with PostgreSQL provider documentation including configuration, activation, and behavioral differences from MySQL

Test plan

  • go build -o /dev/null ./go/cmd/orchestrator compiles successfully
  • go test ./go/inst/... -run TestPostgreSQL -v -- all 3 tests pass (provider name, interface compliance, non-nil constructor)
  • Manual verification against a PostgreSQL streaming replication cluster

Closes #53

Summary by CodeRabbit

  • New Features
    • Added PostgreSQL support for streaming replication topologies alongside existing database providers.
    • New configuration options for PostgreSQL connectivity including user credentials and SSL mode settings.
    • PostgreSQL replication monitoring capabilities with status tracking and read-only mode management.

Implement the DatabaseProvider interface for PostgreSQL streaming
replication topologies. The provider uses lib/pq to connect to
PostgreSQL instances and maps WAL-based replication state to the
common ReplicationStatus struct.

- GetReplicationStatus: queries pg_stat_wal_receiver (standby) or
  pg_current_wal_lsn (primary)
- IsReplicaRunning: checks WAL receiver status
- SetReadOnly/IsReadOnly: manages default_transaction_read_only
- StartReplication: resumes WAL replay (streaming is automatic)
- StopReplication: pauses WAL replay via pg_wal_replay_pause()

Also adds PostgreSQLTopologyUser and PostgreSQLTopologyPassword
config fields, unit tests, and updated documentation.

Closes #53
Copilot AI review requested due to automatic review settings March 24, 2026 11:55
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 24, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

Adding PostgreSQL support to orchestrator by implementing a PostgreSQLProvider satisfying the DatabaseProvider interface. This includes configuration extensions for PostgreSQL topology credentials, replication status monitoring via WAL receiver queries, and vendoring the lib/pq PostgreSQL driver.

Changes

Cohort / File(s) Summary
PostgreSQL Provider Implementation
go/inst/provider_postgresql.go, go/inst/provider_postgresql_test.go
New PostgreSQLProvider type implementing the DatabaseProvider interface with methods to retrieve replication status from pg_stat_wal_receiver and pg_last_wal_replay_lsn(), check replica running status, control replication via pg_wal_replay_pause/resume(), and manage read-only mode via ALTER SYSTEM.
Configuration Extensions
go/config/config.go
Added exported fields PostgreSQLTopologyUser, PostgreSQLTopologyPassword, and PostgreSQLSSLMode to Configuration struct with PostgreSQLSSLMode defaulting to "require".
Documentation
docs/database-providers.md
Added PostgreSQL Provider section documenting configuration JSON fields, provider activation via inst.SetProvider(), mapping of provider operations to PostgreSQL queries/system calls, and behavioral differences from MySQL regarding WAL receiver and replication control semantics.
PostgreSQL Driver (Vendor)
vendor/github.com/lib/pq/*, go.mod, vendor/modules.txt
Added github.com/lib/pq v1.12.0 as a direct dependency with complete driver implementation including connection management, prepared statements, SSL/TLS support, COPY protocol, error handling, LISTEN/NOTIFY, array/type encoding, and configuration parsing.

Sequence Diagram

sequenceDiagram
    participant Orchestrator as Orchestrator
    participant Provider as PostgreSQLProvider
    participant PgConn as PostgreSQL Connection
    participant WALRcv as WAL Receiver
    participant PgDB as PostgreSQL Backend

    Orchestrator->>Provider: GetReplicationStatus(key)
    Provider->>PgConn: openPostgreSQLTopology()
    PgConn->>PgDB: CONNECT

    Provider->>PgDB: pg_is_in_recovery()
    PgDB-->>Provider: true (standby)

    Provider->>PgDB: SELECT * FROM pg_stat_wal_receiver
    PgDB-->>WALRcv: WAL receiver status row
    WALRcv-->>Provider: status, lsn

    Provider->>PgDB: pg_last_wal_replay_lsn()
    PgDB-->>Provider: current LSN position

    Provider->>PgDB: pg_last_xact_replay_timestamp()
    PgDB-->>Provider: replay timestamp

    Provider->>Provider: Calculate lag, populate ReplicationStatus
    Provider-->>Orchestrator: ReplicationStatus{<br/>ReplicaRunning: true,<br/>ReplayLag: calculated<br/>}
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A new dialect whispers through the wire,
PostgreSQL joins the replication choir!
WAL logs dance, slots synchronize with grace,
Our fuzzy orchestrator finds a brand new place.
Thump thump goes the heartbeat, LSN by LSN! 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 56.25% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add PostgreSQL replication provider' clearly summarizes the main change—introducing PostgreSQL support to the replication provider system.
Linked Issues check ✅ Passed The PR implements the DatabaseProvider interface for PostgreSQL, supports discovery of replication state via WAL queries, and includes LSN-based position tracking, addressing all core objectives from issue #53.
Out of Scope Changes check ✅ Passed All changes directly support PostgreSQL provider implementation: config fields, provider implementation, documentation, vendor dependency, and tests. No unrelated changes detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch issue53-postgresql-provider

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new PostgreSQL DatabaseProvider implementation so orchestrator can interact with PostgreSQL streaming replication via the provider abstraction layer (alongside the existing MySQL provider).

Changes:

  • Introduces PostgreSQLProvider implementing DatabaseProvider, plus minimal unit tests.
  • Extends orchestrator configuration with PostgreSQL topology credentials.
  • Vendors github.com/lib/pq and documents PostgreSQL provider usage/behavior.

Reviewed changes

Copilot reviewed 5 out of 53 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
vendor/modules.txt Adds github.com/lib/pq to vendored module list.
vendor/github.com/lib/pq/stmt.go Vendored pq driver source.
vendor/github.com/lib/pq/staticcheck.conf Vendored pq staticcheck configuration.
vendor/github.com/lib/pq/ssl.go Vendored pq TLS handling.
vendor/github.com/lib/pq/scram/scram.go Vendored pq SCRAM implementation.
vendor/github.com/lib/pq/rows.go Vendored pq rows decoding logic.
vendor/github.com/lib/pq/quote.go Vendored pq SQL quoting helpers.
vendor/github.com/lib/pq/pqerror/pqerror.go Vendored pq error-code helpers.
vendor/github.com/lib/pq/pqerror/codes.go Vendored generated PostgreSQL error codes.
vendor/github.com/lib/pq/oid/types.go Vendored generated OID constants/type names.
vendor/github.com/lib/pq/oid/doc.go Vendored OID package docs/types.
vendor/github.com/lib/pq/notify.go Vendored LISTEN/NOTIFY support.
vendor/github.com/lib/pq/notice.go Vendored NOTICE handler plumbing.
vendor/github.com/lib/pq/krb.go Vendored GSS provider registration interface.
vendor/github.com/lib/pq/internal/proto/sz_64.go Vendored protocol size constants (non-32bit).
vendor/github.com/lib/pq/internal/proto/sz_32.go Vendored protocol size constants (32bit).
vendor/github.com/lib/pq/internal/proto/proto.go Vendored protocol constants/types.
vendor/github.com/lib/pq/internal/pqutil/user_windows.go Vendored user lookup (Windows).
vendor/github.com/lib/pq/internal/pqutil/user_posix.go Vendored user lookup (POSIX).
vendor/github.com/lib/pq/internal/pqutil/user_other.go Vendored user lookup (other platforms).
vendor/github.com/lib/pq/internal/pqutil/pqutil.go Vendored misc helpers.
vendor/github.com/lib/pq/internal/pqutil/perm_unsupported.go Vendored SSL key permission handling (unsupported OSes).
vendor/github.com/lib/pq/internal/pqutil/perm.go Vendored SSL key permission checks.
vendor/github.com/lib/pq/internal/pqutil/path.go Vendored home/.pgpass path logic.
vendor/github.com/lib/pq/internal/pqtime/pqtime.go Vendored timestamp parsing/formatting.
vendor/github.com/lib/pq/internal/pqtime/loc.go Vendored timezone cache.
vendor/github.com/lib/pq/internal/pqsql/copy.go Vendored COPY detection helper.
vendor/github.com/lib/pq/internal/pgservice/pgservice.go Vendored service-file support.
vendor/github.com/lib/pq/internal/pgpass/pgpass.go Vendored .pgpass parsing.
vendor/github.com/lib/pq/error.go Vendored pq Error type and error handling.
vendor/github.com/lib/pq/encode.go Vendored text/binary encoding/decoding.
vendor/github.com/lib/pq/doc.go Vendored pq package documentation.
vendor/github.com/lib/pq/deprecated.go Vendored deprecated APIs/aliases.
vendor/github.com/lib/pq/copy.go Vendored COPY FROM STDIN implementation.
vendor/github.com/lib/pq/conn_go18.go Vendored context-aware driver interfaces/cancel.
vendor/github.com/lib/pq/compose.yaml Vendored pq dev/test compose file.
vendor/github.com/lib/pq/buf.go Vendored protocol read/write buffers.
vendor/github.com/lib/pq/as_go126.go Vendored go1.26 build-tag helper.
vendor/github.com/lib/pq/as.go Vendored As() helper (pre-go1.26).
vendor/github.com/lib/pq/array.go Vendored array scanning/encoding.
vendor/github.com/lib/pq/README.md Vendored pq README.
vendor/github.com/lib/pq/LICENSE Vendored pq license.
vendor/github.com/lib/pq/CHANGELOG.md Vendored pq changelog.
vendor/github.com/lib/pq/.gitignore Vendored pq gitignore.
vendor/github.com/lib/pq/.gitattributes Vendored pq gitattributes.
go/inst/provider_postgresql_test.go Adds basic tests for provider name/interface/non-nil constructor.
go/inst/provider_postgresql.go Implements PostgreSQL provider using database/sql + lib/pq.
go/config/config.go Adds PostgreSQL topology credential fields to Configuration.
go.sum Adds lib/pq checksums.
go.mod Adds lib/pq dependency.
docs/database-providers.md Documents PostgreSQL provider configuration/activation/behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +118 to +125
| Operation | PostgreSQL Implementation |
|---------------------|---------------------------------------------------------------|
| GetReplicationStatus | Queries `pg_stat_wal_receiver` (standby) or `pg_current_wal_lsn()` (primary). Reports WAL LSN as position and `replay_lag` as lag. |
| IsReplicaRunning | Checks `pg_stat_wal_receiver` for an active WAL receiver with `status = 'streaming'`. |
| SetReadOnly | Runs `ALTER SYSTEM SET default_transaction_read_only = on/off` followed by `SELECT pg_reload_conf()`. |
| IsReadOnly | Queries `SHOW default_transaction_read_only`. |
| StartReplication | Calls `SELECT pg_wal_replay_resume()`. Streaming replication itself starts automatically when the standby connects. |
| StopReplication | Calls `SELECT pg_wal_replay_pause()` to pause WAL replay. The WAL receiver remains connected. |
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The table claims GetReplicationStatus reports replay_lag as lag, but the current provider implementation runs on the standby and will generally not be able to read replay_lag (it comes from pg_stat_replication on the primary). Please adjust the documentation to match the actual lag source, or update the provider to compute/report lag in the documented way.

Copilot uses AI. Check for mistakes.
Comment thread go/inst/provider_postgresql.go Outdated
Comment on lines +45 to +49
cfg := config.Config
connStr := fmt.Sprintf(
"host=%s port=%d user=%s password=%s dbname=postgres sslmode=disable connect_timeout=5",
hostname, port, cfg.PostgreSQLTopologyUser, cfg.PostgreSQLTopologyPassword,
)
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PostgreSQL DSN is built via fmt.Sprintf with raw user/password values. In lib/pq key=value DSNs, values containing spaces/quotes/backslashes need quoting/escaping; otherwise passwords like p@ss word or values containing sslmode= can break the connection string (or change its semantics). Consider constructing the DSN via net/url (postgres://user:pass@host:port/db?...) or using pq's Config/Connector API so credentials are properly escaped.

Copilot uses AI. Check for mistakes.
Comment thread go/inst/provider_postgresql.go Outdated
func openPostgreSQLTopology(hostname string, port int) (*sql.DB, error) {
cfg := config.Config
connStr := fmt.Sprintf(
"host=%s port=%d user=%s password=%s dbname=postgres sslmode=disable connect_timeout=5",
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DSN hardcodes sslmode=disable, which forces plaintext connections even when TLS is available/required. This is a security regression compared to letting the driver default or allowing configuration. Please make sslmode (and related TLS options) configurable, or at minimum avoid forcing disable so deployments can use require/verify-full as needed.

Suggested change
"host=%s port=%d user=%s password=%s dbname=postgres sslmode=disable connect_timeout=5",
"host=%s port=%d user=%s password=%s dbname=postgres connect_timeout=5",

Copilot uses AI. Check for mistakes.
Comment thread go/inst/provider_postgresql.go Outdated
Comment on lines +44 to +57
func openPostgreSQLTopology(hostname string, port int) (*sql.DB, error) {
cfg := config.Config
connStr := fmt.Sprintf(
"host=%s port=%d user=%s password=%s dbname=postgres sslmode=disable connect_timeout=5",
hostname, port, cfg.PostgreSQLTopologyUser, cfg.PostgreSQLTopologyPassword,
)
db, err := sql.Open("postgres", connStr)
if err != nil {
return nil, err
}
db.SetMaxOpenConns(3)
db.SetMaxIdleConns(1)
return db, nil
}
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

openPostgreSQLTopology creates a new *sql.DB for every call and closes it at the end of each provider method. Since provider methods are called frequently during discovery/polling, this defeats connection pooling and can create significant connection churn/load on PostgreSQL. Consider caching sql.DB per host:port (similar to sqlutils.GetGenericDB), or maintaining a long-lived pool and not closing it per request.

Copilot uses AI. Check for mistakes.
Comment thread go/inst/provider_postgresql.go Outdated
Comment on lines +60 to +61
// On a standby it queries pg_stat_wal_receiver; on a primary it queries
// pg_stat_replication.
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says the primary path queries pg_stat_replication, but the implementation only queries pg_current_wal_lsn(). Either update the comment to match behavior, or implement the intended pg_stat_replication-based mapping (e.g., to expose downstream replica lag/health on primaries).

Suggested change
// On a standby it queries pg_stat_wal_receiver; on a primary it queries
// pg_stat_replication.
// On a standby it queries pg_stat_wal_receiver; on a primary it derives status
// using pg_current_wal_lsn() and related functions.

Copilot uses AI. Check for mistakes.
Comment thread go/inst/provider_postgresql.go Outdated
Comment on lines +91 to +93
EXTRACT(EPOCH FROM replay_lag)
FROM pg_stat_wal_receiver w
LEFT JOIN pg_stat_replication r ON true
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This standby query extracts replay_lag by LEFT JOINing pg_stat_replication, but pg_stat_replication is a primary view and will be empty on standbys, so lagSeconds will typically be NULL and reported as unknown (-1). If you want standby lag, consider computing it from standby-side functions (e.g., now() - pg_last_xact_replay_timestamp()), or from pg_stat_wal_receiver.latest_end_time, and remove the pg_stat_replication join.

Suggested change
EXTRACT(EPOCH FROM replay_lag)
FROM pg_stat_wal_receiver w
LEFT JOIN pg_stat_replication r ON true
EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())
FROM pg_stat_wal_receiver w

Copilot uses AI. Check for mistakes.
return log.Errore(err)
}
defer db.Close()

Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StartReplication always calls pg_wal_replay_resume(). On a primary (not in recovery) this function errors, so callers that accidentally invoke StartReplication on a primary will get a failure. Consider checking pg_is_in_recovery() first (or reusing the same inRecovery check as GetReplicationStatus) and either no-op or return a clearer error when not a standby.

Suggested change
// Only standbys (in recovery) support WAL replay control functions.
var inRecovery bool
if err := db.QueryRow("SELECT pg_is_in_recovery()").Scan(&inRecovery); err != nil {
return log.Errore(err)
}
if !inRecovery {
log.Infof("StartReplication called on primary %s:%d; instance is not in recovery, so WAL replay resume is not applicable", key.Hostname, key.Port)
return nil
}

Copilot uses AI. Check for mistakes.
return log.Errore(err)
}
defer db.Close()

Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StopReplication always calls pg_wal_replay_pause(), which will error on primaries (not in recovery). Similar to StartReplication, consider guarding with pg_is_in_recovery() and no-op / return a clearer error when invoked on a primary.

Suggested change
var inRecovery bool
if err := db.QueryRow("SELECT pg_is_in_recovery()").Scan(&inRecovery); err != nil {
return log.Errore(err)
}
if !inRecovery {
return log.Errore(fmt.Errorf("cannot pause WAL replay on primary %s:%d (not in recovery)", key.Hostname, key.Port))
}

Copilot uses AI. Check for mistakes.
@gemini-code-assist
Copy link
Copy Markdown

Warning

Gemini is experiencing higher than usual traffic and was unable to create the summary. Please try again in a few hours by commenting /gemini summary.

@renecannao renecannao merged commit 95e1966 into master Mar 24, 2026
4 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PostgreSQL replication provider

2 participants